Estimating the effort and quality of a system is a critical step at the beginning of every software 
project. It is necessary to have reliable ways of calculating these measures, and it is even better 
when the calculation can be done as early as possible in the development life-cycle. 

With this in mind, metrics for formal specifications are examined with a view to their correlation 
with complexity- and quality-based code measures. A case study, based on a Z specification and its 
implementation in ADA, analyzes the practicability of these metrics as predictors. 

1 Introduction 

Recent studies in the areas of software metrics and project management have stimulated a lot of ideas 
of how development effort can be estimated and which metrics are of relevance (HQS [HI . Basically, 
they all suggest that the collection of data and the estimation process should be performed as early and 
as objectively as possible, so why not take a closer look at the properties of formal specifications? 

To the best of our knowledge, the only publicly available case study that took a closer look at 
correlations between specifications and implementations was conducted by Samson, Nevill and Dugard in 
1987 [17]. The authors used Modula-2 modules and a HOPE specification to show that there is a 
correlation between the number of equations in HOPE and the number of lines of source code and cyclomatic 
complexity in the Modula-2 modules. However, the authors admit that the study is relatively small-scale, 
as their data is based on only 9 experimental subjects. 

The objective of this paper is to shed some more light on the question of whether the properties of 
specifications can help to predict attributes of derived implementations. For this, the following 
strategy is pursued: firstly, based on a set of well-known measures, the paper tries to find out whether 
some of the measures are correlated. Secondly, it suggests a prediction model for some of the measures. A 
case study, based on the specification and implementation of the Tokeneer system [5], forms the basis 
for these considerations. It takes the Z specification of the system and its implementation in ADA as the 
point of departure and identifies those parts of the code that unambiguously implement specific parts of 
the specification. After that, it calculates size-, structure- and quality-related measures for both 
documents. Finally, it looks for correlations between the measures and, based on the findings, calculates 
a prediction model for several ADA-based size- and complexity-related measures. 

This paper is structured as follows: Section 2 briefly introduces the code and specification measures 
that are used in the study. Section 3 presents the setting of the study, the experimental subject and the 
statistical tests used. Next, Section 4 presents and discusses the results of the correlation tests, and 
Section 5 presents the prediction model. Section 6 discusses possible threats to validity, and, finally, 
Section 7 summarizes the findings and discusses possible next steps. 
2 Measures 

This section introduces the measures used for assessing the Z specification and its implementation in 
ADA. Please note that, due to limitations of space, only a brief overview of the measures is provided.¹ 

2.1 Code-based Measures 

The implementation language of the Tokeneer specification [5] is ADA. In his master's thesis, Tabareh 
[20] took a look at currently available environments that are able to generate practical measures from 
ADA code. He suggests applying the Understand tool and uses the following measures (where M denotes 
either an ADA function or a procedure)² for a preliminary study comparing ADA and Z-based measures: 

• CountLine CL(M): It counts the number of physical lines. 

• CountLineCode CLC(M): It counts the number of lines that contain source code. 

• CountLineExecutable CLCE(M): It counts the number of lines containing executable ADA code. 

• CountLineCodeDecl CLCD(M): It counts the number of lines containing declarative ADA code. 

• Knots Count KNOTS(M): It is a measure for the structuredness of a module and counts overlapping 
jumps in the program flow graph. 

• Cyclomatic Complexity CYC(M): It measures the maximum number of linearly independent paths 
through a program and is extracted by counting the minimum set of paths which can be used to 
construct all other paths through the graph. 
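
As a small illustration of the cyclomatic complexity measure, the following sketch computes McCabe's number V(G) = E - N + 2P from an explicit control-flow graph. The edge-list encoding, the function name and the example graph are assumptions made for this illustration; they are not taken from the Understand tool.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's V(G) = E - N + 2P for a control-flow graph.

    edges:      list of (source, target) pairs (hypothetical encoding)
    components: number of connected components P (1 for a single routine)
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Hypothetical routine: an if-then-else followed by a loop.
cfg = [
    ("entry", "then"), ("entry", "else"),
    ("then", "join"), ("else", "join"),
    ("join", "loop"), ("loop", "join"),
    ("join", "exit"),
]

print(cyclomatic_complexity(cfg))  # 7 edges - 6 nodes + 2 = 3
```

The result equals the number of decision points (the branch and the loop condition) plus one, which is the usual shortcut for structured code.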

In order to focus even more on structural properties of the code, this study additionally makes use of 
Shepperd and Ince's information flow count [18]. The general idea is that the complexity of a module is 
related to the number of flows or channels of information between the module and its environment. For 
this, the Understand tool can be used to generate the call-graph, and the flow of data and control then 
forms the basis for the calculation of the Shepperd Information Flow SI of a module M: 

• Fan-in (FIN(M)): It comprises the number of data-flows terminating at a component M. 

• Fan-out (FOUT(M)): It comprises the number of data-flows originating at a component M. 

• Information Flow (SI(M)): It comprises the number of information flows related to a component 
M and is calculated via $SI(M) = (FIN(M) \cdot FOUT(M))^2$. 
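
The three information flow measures can be sketched in a few lines. The representation of data-flows as (source, target) pairs and the module names are assumptions made for this example; in the study the flows are derived from the call-graph generated by the Understand tool.

```python
from collections import Counter


def information_flow(flows):
    """Compute FIN, FOUT and SI = (FIN * FOUT)^2 per module.

    flows: iterable of (source, target) data-flow pairs (hypothetical encoding)
    """
    flows = list(flows)
    fan_out = Counter(source for source, _ in flows)
    fan_in = Counter(target for _, target in flows)
    modules = set(fan_in) | set(fan_out)
    return {
        m: {"FIN": fan_in[m], "FOUT": fan_out[m],
            "SI": (fan_in[m] * fan_out[m]) ** 2}
        for m in modules
    }


# Hypothetical flows between three modules A, B and C.
result = information_flow([("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")])
for module in sorted(result):
    print(module, result[module])
```

Because of the square, SI grows much faster than either fan-in or fan-out alone, which is worth keeping in mind when its correlation with linear size measures is examined.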

2.2 Specification Measures 

Most of the complexity measures for formal specifications focus on size. The reasons are that size-based 
measures (like lines of specification text) are easy to calculate and yield a single number that is easy to 
interpret. This is not so much the case for structure- and quality-related measures. Their calculation is 
usually based on the notion of control and data dependencies, concepts that are not necessarily 
dominant principles of a specification language. However, several authors [4], [12] demonstrated that a 
reconstruction of these dependencies is possible. 

¹ An in-depth discussion of other specification-based measures can be found in the Ph.D. thesis of Bollin [1]. 
² The tool and a description of the measures can be found at the Understand homepage at www.scitools.com. Page last 
visited: May 2012. 

Recently, Bollin showed that coupling- and cohesion-based measures can reasonably be mapped to 
formal Z specifications. The basis for the calculation of all the measures is a graph that contains 
vertices (called primes) for all predicates and declarations of the specification, and arcs representing 
(reconstructed) control and data dependencies [2]. With such a graph as a basis, the following measures 
(defined for schemas ψ that are part of a specification Ψ) are used in the remainder of this work: 
• Conceptual Complexity CC(ψ): The conceptual complexity equals the total number of prime 
vertices in the graph (representing a schema ψ). 

• Logical Complexity v'(ψ) = (l, u): The lower bound value l of the measure is 1 plus the number 
of primes that are terminal vertices of control dependency arcs. It can be compared to counting 
the number of decision statements in programs. The upper bound value u equals 1 plus the total 
number of control dependencies. It reflects the total amount of dependencies to be considered. 

• Definition-Use Count DU(ψ): This measure equals the total number of data dependencies. 

• Use Count USE(ψ): The use count equals the number of identifiers used in the schema ψ. 

• Definition Count DEF(ψ): The definition count equals the number of identifiers referring to an 
after state in the schema ψ of the specification. 

• And-Count AND(ψ): This measure equals the number of AND-combined predicates in ψ. 

• Or-Count OR(ψ): This measure equals the number of OR-combined predicates in ψ. 
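
The graph-based measures CC, v' and DU can be illustrated with a minimal sketch. The graph encoding below (a list of primes plus two lists of arcs) is a hypothetical simplification, and the sketch assumes that the "terminal vertex" of a control dependency arc is the prime the arc points to; neither detail is taken from the ViZ implementation.

```python
def structure_measures(primes, control_arcs, data_arcs):
    """Size and structure measures for one schema graph.

    primes:       identifiers of the prime vertices (hypothetical encoding)
    control_arcs: list of (source, target) control dependencies
    data_arcs:    list of (source, target) data dependencies
    """
    # Assumption: the terminal vertex of an arc is its target prime.
    control_targets = {target for _, target in control_arcs}
    return {
        "CC": len(primes),                  # conceptual complexity
        "v_lower": 1 + len(control_targets),  # lower bound l
        "v_upper": 1 + len(control_arcs),     # upper bound u
        "DU": len(data_arcs),               # definition-use count
    }


# Hypothetical schema with four primes.
measures = structure_measures(
    primes=["p1", "p2", "p3", "p4"],
    control_arcs=[("p1", "p3"), ("p2", "p3"), ("p1", "p4")],
    data_arcs=[("p1", "p2"), ("p2", "p4")],
)
print(measures)
```

The lower bound counts each controlled prime once, whereas the upper bound counts every arc, so l can never exceed u.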

Semantics-based measures can be calculated by generating slices. The idea goes back to the work 
of Weiser [21], who introduced five slice-based measures for cohesion: Tightness, Coverage, Overlap, 
Parallelism and Clustering. Ott and Thuss [14] partly formalized these measures. Coupling, on the other 
hand, was originally defined as the number of local information flows entering (fan-in) and leaving 
(fan-out) a procedure. Harman et al. [7] demonstrate that it can also be calculated via slicing. Mapping 
their approaches to Z and evaluating them leads to the following set of quality-based specification metrics: 

• Coverage Cov(ψ): It measures the compactness of a schema by comparing the length of all possible 
slices to the length of the specification schema ψ. 

• Overlap O(ψ): It measures the conciseness of a schema ψ by counting those statements that are 
common to all of the possible slices and relates the number to the size of all slices. 

• Schema Coupling χ(Ψ, ψi): It is the weighted measure of the information flow between a given 
schema ψi and all other schemas in Ψ. 
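
To make Coverage and Overlap more tangible, the following sketch uses one common formalization in the style of Ott and Thuss: Coverage as the mean slice size relative to the schema size, and Overlap as the mean share of each slice that is common to all slices. These exact formulas, and the representation of slices as sets of prime identifiers, are assumptions of this example and may differ in detail from the definitions used in the study.

```python
def coverage(slices, schema_size):
    """Mean slice size divided by the size of the schema (assumed formula)."""
    return sum(len(s) for s in slices) / (len(slices) * schema_size)


def overlap(slices):
    """Mean share of each slice that is common to all slices (assumed formula)."""
    common = set.intersection(*(set(s) for s in slices))
    return sum(len(common) / len(s) for s in slices) / len(slices)


# Hypothetical schema with five primes and two non-empty slices.
slices = [{1, 2, 3}, {2, 3, 4, 5}]
print(round(coverage(slices, schema_size=5), 3))
print(round(overlap(slices), 3))
```

Both functions expect at least one slice and no empty slices; a schema whose slices share most of their primes yields an Overlap close to 1.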

3 The Study 

The study is split into two parts and aims at answering the following two questions: (a) what types of 
correlation exist between specification-based and code-based measures, and (b) is it possible to predict 
code-based measures from specification-based measures? 

The Tokeneer system [5] is one of the rare publicly available, industrial-size formal Z specifications 
that come with a fully derived implementation. It has been developed by Praxis and the 
NSA and provides a specification for an identification station consisting of a fingerprint reader, a display 
and a card reader. The code, written in ADA, consists of 11,807 lines of executable ADA code (34,769 
lines including comments). The exceptional feature of the ADA files is that they contain so-called "trace 
unit" comments, which are direct links to the corresponding sections in the formal design document, thus 
unambiguously linking specification text (schemas) and implementation code (procedures and functions) 
together. The Z specification consists of 11,356 lines of text, including 4,808 lines of specification 
text. The specification itself contains 3,295 declarations and predicates, 132,088 
control dependencies and 6,145 data dependencies. 

The subjects for this study are a set of pairs of code modules (procedures and functions) and their 
related Z specification (schemas).³ However, the mapping is not always one-to-one, and it is also not 
total. There are a couple of trace-units that do not have a corresponding part in the implementation, and 
there are also links to trace-units that are non-existent. Thus, as a first step in the preparation phase of this 
study, a small script was written for matching the references and units automatically, sorting out spelling 
errors and dangling links. Then, the result of the mapping was verified and cross-checked by hand. 
This process yielded 70 units with a traceable transformation of Z code to ADA code. 

³ The set of experimental subjects can be found in an Appendix (containing relevant background materials) at the ViZ [2] 
homepage via the link http://viz.uni-klu.ac.at/images/research/materials/fmdsl2-addon.pdf. Page last visited: May 2012. 
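
The matching step can be sketched as follows. This is not the script used in the study: the data layout (a dictionary from procedure names to the trace-unit name found in their comments), the unit names and the similarity cutoff are all hypothetical, and the fuzzy matching via Python's difflib module is only one possible way to absorb spelling errors.

```python
import difflib


def match_trace_links(code_links, spec_units, cutoff=0.85):
    """Match trace-unit references in the code against units in the specification.

    code_links: {procedure name: referenced trace unit} (hypothetical layout)
    spec_units: list of trace-unit names that exist in the specification
    Returns (matched, dangling): resolved links and procedures without a target.
    """
    matched, dangling = {}, []
    for procedure, unit in code_links.items():
        if unit in spec_units:
            matched[procedure] = unit
            continue
        # Tolerate small spelling errors in the reference.
        close = difflib.get_close_matches(unit, spec_units, n=1, cutoff=cutoff)
        if close:
            matched[procedure] = close[0]
        else:
            dangling.append(procedure)
    return matched, dangling


# Hypothetical references: one exact, one misspelled, one dangling.
code_links = {
    "Clear": "FD.AdminToken.Clear",
    "Init": "FD.Admin.Innit",
    "Poll": "FD.Missing.Unit",
}
spec_units = ["FD.AdminToken.Clear", "FD.Admin.Init"]

matched, dangling = match_trace_links(code_links, spec_units)
print(matched)
print(dangling)
```

Automatically repaired links are exactly the ones that deserve the manual cross-check mentioned above, since a fuzzy match can also pair two genuinely different units.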

The first part of the study deals with the question of relatedness between the measures of 
specification-code pairs. As we do not know whether the measures are normally distributed, three different 
statistical tests are used to assess the data: Pearson's correlation coefficient, Spearman's rank correlation 
coefficient, and Kendall's tau correlation coefficient. Pearson's correlation coefficient (Rp) measures 
the degree of association between the variables, assuming a normal distribution of the values [16, p. 212]. 
Though this test might not necessarily fail when the data is not normally distributed, Pearson's test 
only looks for a linear correlation. It might indicate no correlation even if the data is correlated in a 
non-linear manner. As the data might not be normally distributed, Spearman's rank correlation coefficient 
(Rs) has been chosen as well [16, p. 219]. It is a non-parametric test of correlation and assesses how well a 
monotonic function describes the association between the variables. This is done by ranking the sample 
data separately for each variable. Finally, Kendall's robust correlation coefficient (Rk) is used as an 
alternative to Spearman's test [16, p. 200]. It is also non-parametric and investigates the relationship 
among pairs of data. However, it ranks the data relatively and is able to identify partial correlations. 
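
The difference between the three coefficients can be illustrated with a self-contained sketch. The implementations below are textbook versions written for this example (Kendall's coefficient in its tau-b variant, which is an assumption, since the variant is not stated here), and the sample data is made up; the study itself used Matlab and SPSS.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, based on concordant and discordant pairs."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: ignored
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / math.sqrt((pairs + ties_x) * (pairs + ties_y))


# Made-up data with a monotonic but non-linear relation.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson(x, y), 3), spearman(x, y), kendall_tau_b(x, y))
```

For such data the two rank-based coefficients report a perfect association, while Pearson's coefficient stays below 1, which is the reason for consulting all three tests.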

A value of $|R| \in [0.8, 1.0]$ is interpreted as indicating a strong association, a value of 
$|R| \in [0.5, 0.8)$ as indicating a moderate association, and a value of $|R| \in [0.0, 0.5)$ as 
indicating a weak association (values rounded to the third decimal place). 
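
The interpretation rule can be written down directly. The function name is made up for this sketch; the sample values are taken from the correlation tables of this study.

```python
def association_strength(r):
    """Classify a correlation coefficient by its absolute value."""
    magnitude = round(abs(r), 3)  # values are rounded to the third decimal place
    if magnitude >= 0.8:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    return "weak"


print(association_strength(0.866))   # Pearson, CC vs. FOUT(M): strong
print(association_strength(0.749))   # Pearson, CC vs. CYC(M): moderate
print(association_strength(-0.623))  # Pearson, Overlap vs. CL(M): moderate
print(association_strength(0.258))   # Pearson, CC vs. FIN(M): weak
```

Only the magnitude matters for the classification; the sign is interpreted separately, as in the discussion of Overlap and Coupling.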

4 Correlation Tests 

After data preparation, the study looked for linear or at least partial correlations between the sets of 
measures. At first, classical size-based measures are considered, and Table 1 (upper part) summarizes 
the results. The p-values for testing the hypothesis of no correlation against the alternative that there is a 
nonzero correlation are less than 0.05 for all tests. The table also shows that there is a moderate to strong 
relation between CC(ψ) and the measures CL(M), KNOTS(M) and FOUT(M). The correlation 
values of the tests are quite similar, but there are a couple of exceptions. Compared to the Pearson test, 
Spearman's rank test shows a higher correlation between most of the size-based measures and the Count 
Line Declarative CLCD(M) measure, indicating that there might be a non-linear correlation between 
them. However, Kendall's test again shows a weak correlation for most of the measures. A similar 
situation can be observed for the measure FIN(M). Here, the Spearman test shows a slightly higher 
correlation than the other two tests, but it still falls into the weak association class. The differences 
between the tests for the SI(M) measure are also interesting. The correlation to the size-based measures 
is not strong, but SI(M) is calculated by also using the square of FOUT(M), and this non-linear tendency can 
be seen in the slightly higher values of the Spearman tests. Yet another issue can be observed: 
cyclomatic complexity is (although only moderately) influenced by the number of logical OR connections 
in the specification. As cyclomatic complexity is related to the number of paths through the program, 
this observation seems to be consistent with the use of OR-combined predicates in a Z specification. 

In a second step, structure-based measures have been looked at. Table 1 (lower part) summarizes 
the results. Again, the p-values are less than 0.05 for all tests. The correlations are not as strong as 
with the pure size-based measures, with one exception: the structure-based measures seem to strongly 
Size-based Z measures (upper part)

Pearson Correlation (Rp, n = 70, p < 0.033)
Measure   CL(M)     CLC(M)    CLCD(M)   CLCE(M)   CYC(M)    KNOTS(M)  FIN(M)    FOUT(M)   SI(M)
CC(ψ)     0.806     0.681     0.343     0.797     0.749     0.842     0.258     0.866     0.255
AND(ψ)    0.538     0.507     0.369     0.519     0.586     0.477     0.366     0.482     0.384
OR(ψ)     0.450     0.616     0.435     0.640     0.697     0.490     0.456     0.628     0.481

Spearman's Rank Correlation (Rs, n = 70, p < 0.016)
Measure   CL(M)     CLC(M)    CLCD(M)   CLCE(M)   CYC(M)    KNOTS(M)  FIN(M)    FOUT(M)   SI(M)
CC(ψ)     0.784     0.742     0.653     0.797     0.770     0.783     0.418     0.849     0.615
AND(ψ)    0.398     0.428     0.406     0.457     0.454     0.425     0.286     0.452     0.343
OR(ψ)     0.697     0.731     0.699     0.760     0.748     0.704     0.497     0.774     0.619

Kendall Robust Correlation (Rk, n = 70, p < 0.025)
Measure   CL(M)     CLC(M)    CLCD(M)   CLCE(M)   CYC(M)    KNOTS(M)  FIN(M)    FOUT(M)   SI(M)
CC(ψ)     0.586     0.544     0.462     0.595     0.577     0.623     0.300     0.686     0.448
AND(ψ)    0.289     0.308     0.286     0.343     0.334     0.341     0.200     0.356     0.241
OR(ψ)     0.553     0.596     0.575     0.629     0.629     0.563     0.404     0.654     0.507

Structure-based Z measures (lower part)

Pearson Correlation (Rp, n = 70, p < 0.028)
Measure   CL(M)     CLC(M)    CLCD(M)   CLCE(M)   CYC(M)    KNOTS(M)  FIN(M)    FOUT(M)   SI(M)
v'_l(ψ)   0.789     0.676     0.382     0.764     0.743     0.793     0.262     0.813     0.264
v'_u(ψ)   0.787     0.661     0.358     0.758     0.739     0.794     0.259     0.808     0.262
DU(ψ)     0.799     0.642     0.285     0.777     0.736     0.817     0.279     0.833     0.282

Spearman's Rank Correlation (Rs, n = 70, p < 0.002)
Measure   CL(M)     CLC(M)    CLCD(M)   CLCE(M)   CYC(M)    KNOTS(M)  FIN(M)    FOUT(M)   SI(M)
v'_l(ψ)   0.782     0.727     0.638     0.785     0.753     0.775     0.416     0.838     0.603
v'_u(ψ)   0.764     0.695     0.602     0.762     0.725     0.785     0.363     0.824     0.556
DU(ψ)     0.767     0.682     0.593     0.777     0.728     0.784     0.377     0.832     0.600

Kendall Robust Correlation (Rk, n = 70, p < 0.003)
Measure   CL(M)     CLC(M)    CLCD(M)   CLCE(M)   CYC(M)    KNOTS(M)  FIN(M)    FOUT(M)   SI(M)
v'_l(ψ)   0.603     0.543     0.460     0.602     0.583     0.628     0.305     0.691     0.441
v'_u(ψ)   0.565     0.502     0.431     0.552     0.533     0.619     0.262     0.655     0.402
DU(ψ)     0.567     0.518     0.431     0.584     0.558     0.613     0.280     0.659     0.430

Table 1: Pearson's, Spearman's and Kendall's correlation for size- and structure-based Z measures. 

influence the FOUT(M) count. The other structure-based measures moderately to strongly influence the 
complexity measures CYC(M) and KNOTS(M). This seems to be inherent, as these measures are 
counting dependencies within and between the schemas. The correlation to the other ADA-related measures is 
also moderate to strong. Only the measures CLCD(M), FIN(M) and SI(M) have weak correlations. 

In the case of semantics-based measures the picture has to be looked at in a more differentiated way 
(see Table 2). First, most of the results of the tests concerning Coverage are not statistically significant, 
as their high p values show. The tests indicate no correlation between the ADA-based 
measures and Coverage, but the chance is high that this is wrong. In this situation scatter plots have been 
used to gain a better understanding of the results, and the plots confirmed that there is no correlation at 
all. The other tests indicate weak to moderate relations for Overlap and Coupling, but another point is 
interesting: Overlap and Coupling have different signs. This might partially be explained by the 
fact that Overlap is an indicator for crispness. It is high when the schema is not strongly related to other 
parts of the specification. Coupling, in contrast, is higher when there are more relations to other 
specification schemas. An increase in the value of one measure thus leads to a decrease of the other measure. 

To summarize, there is only a weak to moderate relation between the Z-based measures and CLCD(M), 
FIN(M), and SI(M). But, though not exclusively, there is some moderate to strong correlation between 
the Z-based and the other implementation-based measures. When just focusing, for example, on CL(M), 
CYC(M), KNOTS(M), and FOUT(M) and taking moderate to strong correlations into account, then the 
following can be observed: Firstly, they are all influenced by structure-based measures. Secondly, 
CL(M), KNOTS(M) and FOUT(M) in particular have strong correlations to the Z measures. The next section 
now uses the Z measures to provide regression formulas for the most suitable ADA-based measures. 
Semantics-based Correlation, n = 70

              Pearson                 Spearman                Kendall
Measure       Cov(ψ)    O(ψ)    χ(ψ)  Cov(ψ)    O(ψ)    χ(ψ)  Cov(ψ)    O(ψ)    χ(ψ)
CL(M)     R    0.104  -0.623   0.646   0.054  -0.616   0.686   0.042  -0.495   0.480
          p    0.391   0.000   0.000   0.660   0.000   0.000   0.624   0.000   0.000
CLC(M)    R    0.184  -0.466   0.414   0.186  -0.480   0.541   0.146  -0.364   0.379
          p    0.127   0.000   0.000   0.124   0.000   0.000   0.085   0.000   0.000
CLCD(M)   R    0.229  -0.215   0.116   0.243  -0.369   0.433   0.190  -0.280   0.310
          p    0.057   0.075   0.340   0.042   0.002   0.000   0.026   0.003   0.000
CLCE(M)   R    0.126  -0.559   0.546   0.110  -0.587   0.636   0.079  -0.461   0.440
          p    0.300   0.000   0.000   0.363   0.000   0.000   0.355   0.000   0.000
CYC(M)    R    0.148  -0.534   0.509   0.137  -0.531   0.590   0.103  -0.416   0.412
          p    0.221   0.000   0.000   0.258   0.000   0.000   0.234   0.000   0.000
KNOTS(M)  R    0.062  -0.587   0.637  -0.018  -0.629   0.721  -0.034  -0.530   0.542
          p    0.613   0.000   0.000   0.884   0.000   0.000   0.708   0.000   0.000
FIN(M)    R    0.150  -0.224   0.212   0.245  -0.170   0.267   0.184  -0.132   0.193
          p    0.216   0.062   0.078   0.041   0.160   0.026   0.034   0.164   0.026
FOUT(M)   R    0.096  -0.572   0.582   0.046  -0.599   0.722  -0.008  -0.479   0.541
          p    0.430   0.000   0.000   0.704   0.000   0.000   0.930   0.000   0.000
SI(M)     R    0.123  -0.235   0.227   0.220  -0.406   0.480   0.159  -0.315   0.339
          p    0.309   0.050   0.059   0.067   0.000   0.000   0.063   0.001   0.000

Table 2: Pearson, Spearman and Kendall for semantics-based measures. 



5 Prediction Models 

According to a rule of thumb in regression [10, p. 3], the appropriate number of independent variables 
for a prediction is not more than one fifth of the sample size. With 70 observations this allows up to 14 
variables, so the eleven Z measures presented in Section 2 are all selected to form the maximum model 
in this study. Among several systematic methods for restricting the maximum model, a 
backward elimination procedure [10, p. 8] with a threshold of 0.4 for the P-values is selected. This means 
that a maximum regression model with all eleven independent variables is built. Then all the variables 
with a P-value of more than 0.4 are eliminated. Then, again, another regression model with the reduced 
number of variables is built, iterating until there is no variable with a P-value higher than 0.4. 
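
The elimination loop itself is simple and can be sketched independently of the regression backend. In the sketch below, `fit_p_values` is a hypothetical callable that fits a regression model on the given variables and returns one P-value per variable; here it is replaced by a stub with canned, made-up P-values so that the example runs on its own.

```python
def backward_eliminate(variables, fit_p_values, threshold=0.4):
    """Repeatedly refit and drop all variables whose P-value exceeds the threshold.

    fit_p_values: hypothetical callable mapping a list of variable names
                  to a dict {variable: P-value} for the fitted model
    """
    selected = list(variables)
    while selected:
        p_values = fit_p_values(selected)
        kept = [v for v in selected if p_values[v] <= threshold]
        if len(kept) == len(selected):
            break  # no variable above the threshold is left
        selected = kept
    return selected


# Stub with made-up P-values standing in for real regression fits.
canned = {
    ("CC", "DU", "OR", "Cov"): {"CC": 0.01, "DU": 0.75, "OR": 0.02, "Cov": 0.35},
    ("CC", "OR", "Cov"): {"CC": 0.004, "OR": 0.001, "Cov": 0.45},
    ("CC", "OR"): {"CC": 0.003, "OR": 0.0004},
}


def stub_fit(selected):
    return canned[tuple(selected)]


print(backward_eliminate(["CC", "DU", "OR", "Cov"], stub_fit))  # ['CC', 'OR']
```

Note that a variable can survive the first round and still be removed later (Cov in the stub data), because P-values change each time the model is refitted.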

Table 3 summarizes the final result of this procedure for the five remaining code metrics (as measures 
with a P-value higher than 0.4 have been eliminated). The table shows, for example, that CC(ψ), Cov(ψ) 
and OR(ψ) are best suited for the regression formula that calculates the cyclomatic complexity CYC(M) 
of an ADA module. The level of confidence can be explained by the values of 
Significance F. If the acceptable level of confidence is 95% or higher, then all the code metrics 
with Significance F values of less than 0.05 can be considered predictable using the metrics in Z. All 
Significance F values in Table 3 show that the results of the regressions are highly reliable. The value of 
Adjusted R-Square can be interpreted as an indicator for the precision level of the prediction. In our case 
the values are between 0.620 and 0.840, indicating that the regression models are relatively precise for 
CYC(M) and CLCE(M) and even more precise for CL(M), KNOTS(M) and FOUT(M). With these values at hand it 
makes sense to predict code metrics, and the resulting formulas are as follows: 

$$
\begin{aligned}
CL(M) &= 3.099\,CC(\psi) - 1.237\,USE(\psi) + 2.551\,AND(\psi) - 41.735\,OR(\psi) - 9.873 \\
CLCE(M) &= 0.516\,CC(\psi) - 0.003\,v'_u(\psi) - 0.411\,DEF(\psi) + 5.451\,OR(\psi) + 5.819 \\
CYC(M) &= 0.015\,CC(\psi) + 4.349\,Cov(\psi) - 2.107\,O(\psi) + 1.082\,OR(\psi) + 1.666 \\
KNOTS(M) &= 0.121\,CC(\psi) - 0.001\,v'_u(\psi) - 0.017\,USE(\psi) - 0.092\,DEF(\psi) + 0.027\,AND(\psi) - 0.882 \\
FOUT(M) &= 0.198\,CC(\psi) - 0.401\,v'_l(\psi) - 0.001\,v'_u(\psi) - 0.211\,DEF(\psi) + 1.220\,OR(\psi) + 0.344
\end{aligned}
$$
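
Applying such a formula to a schema is a plain linear evaluation. The sketch below encodes the CL(M) and KNOTS(M) formulas as coefficient dictionaries; the dictionary layout, the key names and the measure values of the example schema are assumptions made for this illustration and do not stem from the Tokeneer data.

```python
# Coefficients transcribed from the CL(M) and KNOTS(M) formulas above.
CL_MODEL = {"intercept": -9.873, "CC": 3.099, "USE": -1.237,
            "AND": 2.551, "OR": -41.735}
KNOTS_MODEL = {"intercept": -0.882, "CC": 0.121, "v_upper": -0.001,
               "USE": -0.017, "DEF": -0.092, "AND": 0.027}


def predict(model, measures):
    """Evaluate a linear regression model for one schema.

    measures: dict of Z measures of the schema (hypothetical key names)
    """
    return model["intercept"] + sum(
        coefficient * measures[name]
        for name, coefficient in model.items()
        if name != "intercept"
    )


# Made-up Z measures of a single schema.
schema = {"CC": 20, "v_upper": 50, "USE": 10, "DEF": 3, "AND": 4, "OR": 1}

print(round(predict(CL_MODEL, schema), 3))
print(round(predict(KNOTS_MODEL, schema), 3))
```

Since the models were fitted on 70 specification-code pairs of one system, predictions for schemas with measure values far outside that data should be treated with caution.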
Results of the backward elimination procedure (values P < 0.4)

Parameter           CL(M)     CLCE(M)   CYC(M)    KNOTS(M)  FOUT(M)
Adjusted R-Square   0.720     0.680     0.620     0.760     0.840
Significance F      5E-18     2E-16     5E-14     7E-20     1.5E-25

P-Value
CC(ψ)               0.001     4E-4      0.003     1E-6      3E-11
v'_l(ψ)             -         -         -         -         0.270
v'_u(ψ)             -         0.005     -         0.004     0.000
DU(ψ)               -         -         -         -         -
O(ψ)                -         -         -         -         -
Cov(ψ)              -         -         0.280     -         -
χ(ψ)                -         -         -         -         -
AND(ψ)              7E-5      -         -         0.030     -
OR(ψ)               5E-4      0.001     4E-4      -         6E-5
DEF(ψ)              -         0.070     -         0.020     1E-5
USE(ψ)              0.034     -         -         0.320     -

Table 3: Adjusted R-Square, Significance F and P-Values after applying the backward elimination 
procedure for maximum model identification. P-Values higher than 0.4 are represented by dashes. 

6 Threats to Validity 

With the results of the study the question of validity arises. Considering internal validity, single group 
and multiple group threats, as well as social threats, cannot arise. The only threat that might have an 
impact on the outcome of the study is the software used to generate and calculate the measures. The 
software components involved are the CZT parser [11], the slicing environment ViZ [2], Matlab R2007b, 
Microsoft Excel 2010 and SPSS 14.0. Excel is a standard spreadsheet application. Matlab and SPSS are 
numerical computing environments used for the statistical analysis. Both tools have been used alternately 
to verify the results of the analysis. It is very unlikely that the data from both environments is erroneous. 
The CZT parser has been developed as a SourceForge project since 2003 and is available in a stable 
release. The slicing environment ViZ was developed in 2003 and has also been the basis of a couple 
of extensions, which led to systematic validations during development. 

Concerning selection validity, the publicly available schemas and ADA procedures and functions 
have been chosen with care, following the links provided by the developers. It is important to note that 
the specification used in this study had to be modified slightly in order to be accepted by the CZT parser. 
This meant introducing some hard spaces and, where necessary, replacing the "=" symbol by the "==" 
sign. In order to rule out the possibility of coincidental changes of line breaks or identifier names, both 
files were compared again afterwards, using professional file-comparison software. 

7 Conclusion 

In this study, consisting of 70 experimental subjects, the feasibility of confidently predicting software 
measures based on formal specifications has been demonstrated. The correlations found between the 
different size-, structure-, and semantics-based measures and the implementation metrics promise that 
size and complexity attributes can be predicted and that likely costs and efforts can be estimated. 

The study describes only the first link in the chain of associations between the documents created 
during software development, but it confirms the observations of Samson et al. [17], who conducted a 
similar study (with 9 experimental subjects) several years ago. Specification-based measures are not 
difficult to calculate, so they can and should be collected at the beginning of a project. The results 
of the study indicate that it pays off. 